Sparsity-Aware Orthogonal Initialization of Deep Neural Networks

Authors

Abstract

Deep neural networks have achieved impressive pattern recognition and generative abilities on complex tasks by developing larger, deeper models, but these are increasingly costly to train and implement. There is, in tandem, interest in developing sparse versions of these powerful models by post-processing them with weight pruning or by dynamic sparse training. However, this requires expensive train-prune-finetune cycles and compromises the trainability of very deep network configurations. We introduce sparsity-aware orthogonal initialization (SAO), a method to initialize sparse yet maximally connected weights. SAO constructs the network topology by leveraging Ramanujan expander graphs to assure connectivity, and assigns weights to attain approximate dynamical isometry. Network sparsity is tunable prior to training the model. Compared to fully-connected models, sparse SAO networks are demonstrated to outperform magnitude-based pruning in networks of up to a thousand layers while requiring fewer computations and training iterations. Convolutional layers impose special constraints, while the kernel size may be interpreted as tuning the sparsity level. Within the SAO framework, kernels are pruned based on the desired compression factor rather than by post-training, parameter-dependent heuristics. SAO is well-suited for applications with tight energy and computation budgets, such as edge computing tasks, because it achieves sparse, trainable networks with fewer learnable parameters without requiring additional layers, additional training, scaling, or regularization. These advantages are attributed to both its sparse topology and its orthogonal initialization.
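To make the core idea concrete, the following is a minimal NumPy sketch of a weight matrix that is both exactly orthogonal and sparse, with a fixed number of nonzeros per row. It is an illustration only: the function `sparse_orthogonal`, its `width`/`degree` parameters, and the block-diagonal-plus-permutation construction are assumptions made here for demonstration; SAO itself derives the sparsity pattern from Ramanujan expander graphs and also handles rectangular and convolutional layers.

```python
import numpy as np

def sparse_orthogonal(width: int, degree: int, seed=None):
    """Illustrative sparse orthogonal matrix: block-diagonal of small random
    orthogonal blocks with permuted rows and columns.

    NOTE: a simplified stand-in for SAO, not the paper's construction; the
    paper builds the sparsity pattern from Ramanujan expander graphs.
    `width` must be divisible by `degree`.
    """
    assert width % degree == 0, "width must be divisible by degree"
    rng = np.random.default_rng(seed)
    W = np.zeros((width, width))
    for b in range(width // degree):
        # Random orthogonal block via QR decomposition of a Gaussian matrix.
        q, _ = np.linalg.qr(rng.standard_normal((degree, degree)))
        s = slice(b * degree, (b + 1) * degree)
        W[s, s] = q
    # Permuting rows and columns preserves orthogonality exactly.
    W = W[rng.permutation(width)][:, rng.permutation(width)]
    return W

W = sparse_orthogonal(8, 2, seed=0)
print(np.allclose(W.T @ W, np.eye(8)))   # exact orthogonality
print((np.abs(W) > 1e-12).sum(axis=1))   # typically `degree` nonzeros per row
```

Because every singular value of an orthogonal matrix equals one, such an initialization preserves signal norms from layer to layer, which is the property SAO targets through approximate dynamical isometry.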


Related Articles

On weight initialization in deep neural networks

A proper initialization of the weights in a neural network is critical to its convergence. Current insights into weight initialization come primarily from linear activation functions. In this paper, I develop a theory for weight initializations with non-linear activations. First, I derive a general weight initialization strategy for any neural network using activation functions differentiable a...
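As a rough illustration of what an activation-aware rule can look like, the sketch below scales the usual fan-in Gaussian initialization by the activation's slope near zero. The function name, parameters, and the specific scaling are assumptions chosen for demonstration and are not necessarily the formula derived in this paper.

```python
import numpy as np

def activation_aware_normal(fan_in: int, fan_out: int, act_slope_at_zero: float, seed=None):
    """Gaussian init whose standard deviation compensates for the activation's
    slope near zero, roughly preserving pre-activation variance across layers
    (illustrative heuristic only, not the cited paper's exact rule)."""
    rng = np.random.default_rng(seed)
    std = 1.0 / (np.sqrt(fan_in) * act_slope_at_zero)
    return rng.normal(0.0, std, size=(fan_out, fan_in))

# tanh'(0) = 1.0    -> recovers the familiar 1/sqrt(fan_in) scaling
# sigmoid'(0) = 0.25 -> larger weights to compensate for the flatter slope
W_tanh = activation_aware_normal(512, 512, act_slope_at_zero=1.0, seed=0)
W_sigm = activation_aware_normal(512, 512, act_slope_at_zero=0.25, seed=0)
print(W_tanh.std(), W_sigm.std())
```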


Learning Structured Sparsity in Deep Neural Networks

High demand for computation resources severely hinders deployment of large-scale Deep Neural Networks (DNN) in resource constrained devices. In this work, we propose a Structured Sparsity Learning (SSL) method to regularize the structures (i.e., filters, channels, filter shapes, and layer depth) of DNNs. SSL can: (1) learn a compact structure from a bigger DNN to reduce computation cost; (2) ob...
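Structured sparsity regularizers of this kind are commonly realized as a group Lasso over structural groups of weights. The sketch below shows one such grouping (whole convolutional filters) in NumPy; the helper name and the penalty weight are illustrative assumptions, and the full SSL method covers further groupings such as channels, filter shapes, and layer depth.

```python
import numpy as np

def filter_group_lasso(conv_weight: np.ndarray) -> float:
    """Filter-wise group Lasso: sum of L2 norms over output filters.

    `conv_weight` has shape (out_channels, in_channels, kH, kW). Penalizing
    whole filters drives entire filters toward zero, so the surviving
    structure stays dense and hardware-friendly. (One illustrative grouping
    only, not the full SSL method.)
    """
    flat = conv_weight.reshape(conv_weight.shape[0], -1)
    return float(np.linalg.norm(flat, axis=1).sum())

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 3, 3, 3))
task_loss = 0.0  # the usual data-fitting loss would go here
total_loss = task_loss + 1e-4 * filter_group_lasso(W)  # structured sparsity term
print(total_loss)
```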


Initialization Matters: Orthogonal Predictive State Recurrent Neural Networks

Learning to predict complex time-series data is a fundamental challenge in a range of disciplines including Machine Learning, Robotics, and Natural Language Processing. Predictive State Recurrent Neural Networks (PSRNNs) (Downey et al., 2017) are a state-of-the-art approach for modeling time-series data which combine the benefits of probabilistic filters and Recurrent Neural Networks into a sin...


SparCE: Sparsity aware General Purpose Core Extensions to Accelerate Deep Neural Networks

Deep Neural Networks (DNNs) have emerged as the method of choice for solving a wide range of machine learning tasks. Satiating the enormous growth in computational demand posed by DNNs is a key challenge for computing system designers and has most commonly been addressed through the design of custom accelerators. However, these specialized accelerators that utilize large quantities of multiply-...


Weight Initialization of Deep Neural Networks (DNNs) using Data Statistics

Deep neural networks (DNNs) form the backbone of almost every state-of-the-art technique in fields such as computer vision, speech processing, and text analysis. The recent advances in computational technology have made the use of DNNs more practical. Despite the overwhelming performance of DNNs and the advances in computational technology, it is seen that very few researchers try to train t...



Journal

Journal title: IEEE Access

Year: 2023

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2023.3295344